10 research outputs found

    AIOps for a Cloud Object Storage Service

    With the growing reliance on the ubiquitous availability of IT systems and services, these systems become more global, scaled, and complex to operate. To maintain business viability, IT service providers must put in place reliable and cost-efficient operations support. Artificial Intelligence for IT Operations (AIOps) is a promising technology for alleviating the operational complexity of IT systems and services. AIOps platforms utilize big data, machine learning, and other advanced analytics technologies to enhance IT operations with proactive, actionable, dynamic insight. In this paper we share our experience applying the AIOps approach to a production cloud object storage service to get actionable insights into the system's behavior and health. We describe a real-life production cloud-scale service and its operational data, present the AIOps platform we have created, and show how it has helped us resolve operational pain points. Comment: 5 pages

    Mostly Concurrent Garbage Collection Revisited

    Mostly concurrent garbage collection was presented in the seminal paper of Boehm et al. With the deployment of Java as a portable, secure, and concurrent programming language, the mostly concurrent garbage collector turned out to be an excellent solution for Java's garbage collection task. The use of this collector is reported for several modern production Java Virtual Machines, and it has been investigated further in academia.

    Mostly accurate stack scanning

    Permission is granted for noncommercial reproduction of the work for educational or research purposes

    A parallel, incremental, mostly concurrent garbage collector for servers

    Multithreaded applications with multi-gigabyte heaps running on modern servers provide new challenges for garbage collection (GC). The challenges for “server-oriented” GC include ensuring short pause times on a multi-gigabyte heap while minimizing the throughput penalty, scaling well on multiprocessor hardware, and keeping the number of expensive multi-cycle fence instructions required by weak ordering to a minimum. We designed and implemented a collector that meets these demands, building on the mostly concurrent garbage collector proposed by Boehm et al. Our collector incorporates new ideas into the original collector. We make it parallel and incremental; we employ concurrent low-priority background GC threads to take advantage of processor idle time; we propose novel algorithmic improvements to the basic mostly concurrent algorithm, improving its efficiency and shortening its pause times; and finally, we use advanced techniques, such as a low-overhead work packet mechanism, to enable full parallelism among the incremental and concurrent collecting threads and to ensure load balancing. We compared the new collector to the mature, well-optimized, parallel, stop-the-world mark-sweep collector already in the IBM JVM. When allowed to run aggressively, using 72% of the CPU during a short concurrent phase, our collector prototype reduces the maximum pause time from 161 ms to 46 ms while losing only 11.5% throughput when running the SPECjbb2000 benchmark on a 600 MB heap on an 8-way 1.1 GHz PowerPC multiprocessor. When the collector is limited to non-intrusive operation using only 29% of the CPU, the maximum pause time obtained is 79 ms and the loss in throughput is 15.4%.

    Dynamic Slice Scaling Mechanisms for 5G Multi-domain Environments

    Network slicing is an essential 5G innovation whereby the network is partitioned into logical segments, so that Communication Service Providers (CSPs) can offer differentiated services for verticals and use cases. In many 5G use cases, network requirements vary over time, and CSPs must dynamically adapt network slices to satisfy the contractual network slice QoS, cooperating and using each other's resources, e.g. when the resources of a single CSP are not sufficient or suitable to maintain all its current SLAs. While this need for dynamic cross-CSP cooperation is widely recognized, it cannot yet be realized due to gaps both in business processes and in technical capabilities. In this paper, we present a 5GZORRO approach to dynamic cross-CSP slice scaling. Our approach enables CSPs to collaborate, providing security and trust with smart multi-party contracts, and uses the resulting collaboration to enable resource sharing across multiple administrative domains, either during slice establishment or when an already existing slice needs to expand or shrink. Our approach allows automating both the business and the technical processes involved in dynamic lifecycle management of cross-CSP network slices, following ETSI’s Zero-Touch Network and Service Management (ZSM) closed-loop architecture and relying on a resource-sharing Marketplace, a Distributed Ledger (DL), and an Operational Data Lake. We show how this approach is realized in a truly cloud-native way, with Kubernetes as both the business and the technical cross-domain orchestrator. We then showcase the applicability of the proposed solution for dynamic scaling of a Content Delivery Network (CDN) service.

    Implementing an On-the-fly Garbage Collector for Java

    Java uses garbage collection (GC) for the automatic reclamation of computer memory no longer required by a running application. GC implementations for Java Virtual Machines (JVM) are typically designed for single processor machines, and do not necessarily perform well for a server program with many threads running on a multiprocessor. We designed and implemented an on-the-fly GC, based on the algorithm of Doligez, Leroy and Gonthier [13, 12] (DLG), for Java in this environment. An on-the-fly collector, a collector that does not stop the program threads, allows all processors to be utilized during collection and provides uniform response times. We extended and adapted DLG for Java (e.g., adding support for weak references) and for modern multiprocessors without sequential consistency, and added performance improvements (e.g., to keep track of the objects remaining to be traced). We compared the performance of our implementation with stop-the-world mark-sweep GC. Our measurements show th..